This notebook provides a number of examples how to plot the outputs of the data quality assessments on OpenStreetMap in osm-hydro. We also provide some functionality to plot spatial data (saved in GeoJSON files) interactively. This helps to inspect the data in detail in a graphical map environment.
%matplotlib inline
import matplotlib.pyplot as plt
import os
import pandas as pd
import numpy as np
Below, we define all the functions to plot data. Please run these first! The plot routines all use example data from the "sample_data" folder in the github repository.
def plot_dm_dq(ax, df, title, set_xlabels=True, colors=['g', 'y', 'orange', 'r']):
"""
Plots a staked bar plot from a pandas dataframe.
Inputs:
ax: matplotlib axis to plot on
df: pandas dataframe (indexes are used as legend)
title: axis title
set_xlabels=True: whether to plot x-axis labels
colors=['g', 'y', 'orange', 'r']: list of colors
Returns:
artists: list of handles to bar objects, which can be used for a legend
"""
width = 0.35
names = df.keys()
index = np.arange(len(names))
bottom = np.zeros(len(index))
artists = []
for n, col in zip(df.iterrows(), colors):
count = np.array(n[1:][0], dtype='int') # get the numbers of features in an array
artist = ax.bar(index, count, width, bottom=bottom, color=col, linewidth=0)
artists.append(artist[0])
bottom += np.atleast_1d(count)
ax.set_xticks(np.atleast_1d(index) + width/2.)
if set_xlabels:
ax.set_xticklabels(tuple(names), rotation='vertical')
else:
ax.set_xticklabels('')
ax.set(ylabel='', title=title)
return artists
def dq_data_model_bar_plot(xls_fn):
"""
Reads in an excel file with data model quality results
A graph for each sheet is prepared. Each sheet contains results for a geographical area
"""
def fn_template(fn, suffix='_{:s}', extension='.png'):
"""
Strips a file name from its extension and provides a template for making a set of files
"""
return str(os.path.splitext(fn) + suffix + extension).format
xls = pd.ExcelFile(xls_fn)
nplots = len(xls.sheet_names)
nrows = int(np.round(np.sqrt(nplots)))
ncols = int(np.ceil(float(nplots)/nrows))
fig = plt.figure(figsize=(16,8))
plt.subplots_adjust(bottom=0.2, hspace=0.25)
for n, name in enumerate(xls.sheet_names): # [0:1]
df = xls.parse(name).set_index('validation')
ax = plt.subplot(nrows, ncols, n + 1)
if n >= nplots-ncols:
set_xlabels=True
else:
set_xlabels=False
artists = plot_dm_dq(ax, df, name, set_xlabels)
fig.legend(artists, df.index, loc='upper center', bbox_to_anchor=(0.4, 0.05),
fancybox=True, shadow=True, ncol=4)
return fig
Below we demonstrate how to plot the results of the check on the data model quality. The results are reported in excel files that contains one sheet per geographic area checked. In our example, we have checked for all wards of Ramani Huria in Dar Es Salaam. Hence we make a subplot for the tag completeness of ewach individual ward. We use the functions defined above to prepare these plots from the excel file. Basically, we only need to provide an excel file, and we receive a figure with all the results back.
report_folder = os.path.abspath(os.path.join('..', 'sample_data', 'report_files'))
xls_fn = os.path.join(report_folder, 'data_model_channels_report.xlsx')
# function below does the actual plotting
fig = dq_data_model_bar_plot(xls_fn)
# save figures as nice looking PDF and PNG files
fig.savefig(os.path.join(os.path.split(xls_fn)[0], 'data_model_channels.pdf'))
fig.savefig(os.path.join(os.path.split(xls_fn)[0], 'data_model_channels.png'), bbox_inches='tight', dpi=300)
We have a check for completeness of waterway and highway crossings. This check provides per geographical area how many features contains crossing information (versus how many don't) and counts for the amount of crossings of a given type. Both the validity of crossing information and the types are plotted below. We do this by reading the excel file and make a stacked bar plot for correctly mapped versus missing crossing information, and a stacked bar plot for the different types of crossings.
# read the report file
xls_fn = os.path.join(report_folder, 'crossings_report.xlsx')
df = pd.read_excel(xls_fn).set_index('validation')
df
# now we make a plot.
fig = plt.figure(figsize=(10,4))
fig.subplots_adjust(bottom=0.4)
ax = fig.add_subplot(121)
artists_corr = plot_dm_dq(ax, df[0:2], 'Correct', colors=['g', 'r'])
ax = fig.add_subplot(122)
artists_type = plot_dm_dq(ax, df[2:], 'Type', colors=['b', 'orange'])
artists = artists_corr + artists_type
fig.legend(artists, df.index, loc='upper center', bbox_to_anchor=(0.42, 0.1),
fancybox=True, ncol=4, fontsize=8.)
fig.savefig(os.path.join(os.path.split(xls_fn)[0], 'crossings.pdf'))
fig.savefig(os.path.join(os.path.split(xls_fn)[0], 'crossings.png'), bbox_inches='tight', dpi=300)
Below we will show how to read in the GeoJSON files generated by the data quality tools and plot these in a interactive map environment. We use the bokeh library for this. First we need to import a number of additional packages.
import fiona
import shapely.geometry as geom
from bokeh.plotting import figure, show
from bokeh.models.tiles import WMTSTileSource
from bokeh.io import output_notebook, push_notebook
from pyproj import Proj, transform
from bokeh.models import ColumnDataSource, HoverTool, Div, ColorBar, LinearColorMapper, TapTool, CustomJS
from bokeh.palettes import Category10, Category20, Plasma
Now we prepare some functions to read the features with the fiona library, and convert the features' coordinates (which are in latitude - longitude WGS84 projection, having EPSG code 4326) into a Web Mercator projected system (EPSG code 3857). The functions return a ColumnDataSource, which can be directly plotted by bokeh in an interactive map
def make_point_cds_WMTS(point_feats):
outProj = Proj(init='epsg:3857')
inProj = Proj(init='epsg:4326')
df = {}
df['x'], df['y'] = zip(*[transform(inProj, outProj, geom.shape(c['geometry']).xy[0][0], geom.shape(c['geometry']).xy[1][0]) for c in point_feats])
for key in point_feats[0]['properties'].keys():
df[key] = [c['properties'][key] for c in point_feats]
return ColumnDataSource(df)
def make_line_cds_WMTS(line_feats):
outProj = Proj(init='epsg:3857')
inProj = Proj(init='epsg:4326')
df = {}
df['x'], df['y'] = zip(*[transform(inProj, outProj, geom.shape(c['geometry']).xy[0].tolist(), geom.shape(c['geometry']).xy[1].tolist()) for c in line_feats])
for key in line_feats[0]['properties'].keys():
df[key] = [c['properties'][key] for c in line_feats]
return ColumnDataSource(df)
We provide a map with waterway line elements filtered out, and plotted with colors, representing their validity for one specific tag. In this case we select the tag "depth" as an example. This can be easily replaced by other tags such as "width", "covered", "blockage" by a user. The resulting graph is a
report_folder = os.path.abspath(os.path.join('..', 'sample_data', 'report_files'))
data_model_dq_fn = os.path.join(report_folder, 'data_model_channels_geo.json')
ds = fiona.open(data_model_dq_fn)
lsource = make_line_cds_WMTS(ds)
ds.close()
output_notebook()
# user input, define the tag to check!
tag = 'depth'
# define colors for the different values
colors = [Category20[8][4],
Plasma[3][2],
Category20[8][2],
Category20[8][6]
]
dq_color_mapper = LinearColorMapper(low=0, high=3, palette=colors)
flag_tag = tag + '_flag'
# we use the Ramani Huria drone background map as placeholder for our data...but you can also use OpenStreetMap.
# We provide a placeholder for that below as well
url='http://b.tiles.mapbox.com/v3/worldbank-education.pebkgmlc/{z}/{x}/{y}.png'
# this one is not used, unquote the bkg_tile = WMTSTileSource(url=url) to visualize OSM instead of the drone data
bkg_tile = WMTSTileSource(
url='http://c.tile.openstreetmap.org/{Z}/{X}/{Y}.png',
attribution=(
'Map tiles by <a href="http://stamen.com">Stamen Design</a>, '
'under <a href="http://creativecommons.org/licenses/by/3.0">CC BY 3.0</a>.'
'Data by <a href="http://openstreetmap.org">OpenStreetMap</a>, '
'under <a href="http://www.openstreetmap.org/copyright">ODbL</a>'
)
)
bkg_tile = WMTSTileSource(url=url)
basic_tools = 'pan, wheel_zoom,box_zoom,reset'
tooltips=[
(key, '@{:s}'.format(key)) for key in lsource.to_df().keys() if key not in ['x', 'y'] if '_flag' not in key
]
map_hover = HoverTool(tooltips=tooltips, line_policy='next')
x = lsource.to_df()['x']
y = lsource.to_df()['y']
xmin = np.min(np.min(x))
xmax = np.max(np.max(x))
ymin = np.min(np.min(y))
ymax = np.max(np.max(y))
bound = 100 # meters
p = figure(x_range=(xmin, xmax), y_range=(ymin, ymax), plot_height=550, plot_width=900, tools=basic_tools, title='Data check {:s}'.format(tag))
p.axis.visible = False
p.add_tile(bkg_tile)
p.multi_line(xs='x', ys='y', color='white', alpha=0.8, line_width=4., source=lsource)
p_update = p.multi_line(xs='x', ys='y', color={'field': flag_tag, 'transform': dq_color_mapper}, alpha=0.5, line_width=2., source=lsource)
color_bar = ColorBar(orientation='horizontal', color_mapper=dq_color_mapper, height=10, width=150, location=(80, 100))
p.add_layout(color_bar)
p.add_tools(map_hover)
t = show(p, notebook_handle=True)
Below, you can select a different tag to check. The interactive plot shown above will be updated on the fly using the new tag.
# below, you can
tag = 'osm_id'
flag_tag = tag + '_flag'
p_update.glyph.line_color = {'field': flag_tag, 'transform': dq_color_mapper}
p.title.text = 'Data check {:s}'.format(tag)
push_notebook(handle=t)
tooltips=[
(key, '@{:s}'.format(key)) for key in lsource.to_df().keys() if key not in ['x', 'y'] if '_flag' not in key
]
print tooltips
We make an interactive plot of the crossings. The roads and waterways are plotted in yellow and blue respectively. Crossings are plotted as dots. Green dots have a valid crossing mapped (flag=0). Orange dots have no information about the crossing (flag=1). When you hover over the dot you see all information, including the type of crossing mapped at that location.
report_folder = os.path.abspath(os.path.join('..', 'sample_data', 'report_files'))
crossing_fn = os.path.join(report_folder, 'crossings_geo.json')
waterways_fn = os.path.join(report_folder, 'crossings_geo_water.json')
highways_fn = os.path.join(report_folder, 'crossings_geo_roads.json')
# first open the point data with checks, and convert to a columnDataSource...
ds = fiona.open(crossing_fn)
psource = make_point_cds_WMTS(ds)
ds.close()
#...then open the line layers and also convert these to a ColumnDataSource
ds = fiona.open(highways_fn)
lsource_highway = make_line_cds_WMTS(ds)
ds.close()
ds = fiona.open(waterways_fn)
lsource_waterway = make_line_cds_WMTS(ds)
ds.close()
output_notebook()
colors = [Category10[3][-1], Category10[3][-2]]
color_mapper = LinearColorMapper(low=0, high=1, palette=colors)
# we use the Ramani Huria drone background map as placeholder for our data...but you can also use OpenStreetMap.
# We provide a placeholder for that below as well
url='http://b.tiles.mapbox.com/v3/worldbank-education.pebkgmlc/{z}/{x}/{y}.png'
# this one is not used, unquote the bkg_tile = WMTSTileSource(url=url) to visualize OSM instead of the drone data
bkg_tile = WMTSTileSource(
url='http://c.tile.openstreetmap.org/{Z}/{X}/{Y}.png',
attribution=(
'Map tiles by <a href="http://stamen.com">Stamen Design</a>, '
'under <a href="http://creativecommons.org/licenses/by/3.0">CC BY 3.0</a>.'
'Data by <a href="http://openstreetmap.org">OpenStreetMap</a>, '
'under <a href="http://www.openstreetmap.org/copyright">ODbL</a>'
)
)
bkg_tile = WMTSTileSource(url=url)
basic_tools = 'pan, wheel_zoom,box_zoom,reset'
tooltips=[
(key, '@{:s}'.format(key)) for key in psource.to_df().keys() if key not in ['x', 'y']
]
map_hover = HoverTool(tooltips=tooltips)
x = psource.to_df()['x']
y = psource.to_df()['y']
xmin = np.min(x)
xmax = np.max(x)
ymin = np.min(y)
ymax = np.max(y)
bound = 100 # meters
p = figure(x_range=(xmin, xmax), y_range=(ymin, ymax), plot_height=550, plot_width=900, tools=basic_tools)
p.axis.visible = False
p.add_tile(bkg_tile)
p.multi_line(xs='x', ys='y', color='white', alpha=0.5, line_width=4., source=lsource_waterway)
p.multi_line(xs='x', ys='y', color='white', alpha=0.5, line_width=4., source=lsource_highway)
p.multi_line(xs='x', ys='y', color='#6699cc', alpha=0.8, line_width=2., source=lsource_waterway)
p.multi_line(xs='x', ys='y', color='#ffcc33', alpha=0.8, line_width=2., source=lsource_highway)
p.scatter(x='x', y='y', fill_color={'field': 'flag', 'transform': color_mapper}, line_color='white', source=psource, alpha=1., size=10)
color_bar = ColorBar(orientation='horizontal', color_mapper=color_mapper, height=10, width=150, location=(80, 100))
p.add_layout(color_bar)
p.add_tools(map_hover)
show(p)
We show how well the network is connected to one outlet point. We have used the outlet with osmid = 143878371 to check the connectivity of the network. We again make an interactive plot of the connectivity. Connected waterways are plotted in blue, while unconnected waterways are in orange. The outlet waterway is also colored in red.
report_folder = os.path.abspath(os.path.join('..', 'sample_data', 'report_files'))
connectivity_fn = os.path.join(report_folder, 'connectivity_osmid_143878371.json')
# first open the point data with checks, and convert to a columnDataSource...
ds = fiona.open(connectivity_fn)
lsource = make_line_cds_WMTS(ds)
ds.close()
tag = 'connected'
connect_feat = '143878371'
df = lsource.to_df()
df_select = df.query('osm_id=={:s}'.format(connect_feat))
lsource_end = ColumnDataSource(df_select)
output_notebook()
colors = [Category10[3][1], Category10[3][0]]
color_mapper = LinearColorMapper(low=0, high=1, palette=colors)
# we use the Ramani Huria drone background map as placeholder for our data...but you can also use OpenStreetMap.
# We provide a placeholder for that below as well
url='http://b.tiles.mapbox.com/v3/worldbank-education.pebkgmlc/{z}/{x}/{y}.png'
# this one is not used, unquote the bkg_tile = WMTSTileSource(url=url) to visualize OSM instead of the drone data
bkg_tile = WMTSTileSource(
url='http://c.tile.openstreetmap.org/{Z}/{X}/{Y}.png',
attribution=(
'Map tiles by <a href="http://stamen.com">Stamen Design</a>, '
'under <a href="http://creativecommons.org/licenses/by/3.0">CC BY 3.0</a>.'
'Data by <a href="http://openstreetmap.org">OpenStreetMap</a>, '
'under <a href="http://www.openstreetmap.org/copyright">ODbL</a>'
)
)
bkg_tile = WMTSTileSource(url=url)
basic_tools = 'pan, wheel_zoom,box_zoom,reset'
tooltips=[
(key, '@{:s}'.format(key)) for key in lsource.to_df().keys() if key not in ['x', 'y']
]
map_hover = HoverTool(tooltips=tooltips, line_policy='next')
x = lsource.to_df()['x']
y = lsource.to_df()['y']
xmin = np.min(np.min(x))
xmax = np.max(np.max(x))
ymin = np.min(np.min(y))
ymax = np.max(np.max(y))
bound = 100 # meters
p = figure(x_range=(xmin, xmax), y_range=(ymin, ymax), plot_height=550, plot_width=900, tools=basic_tools)
p.axis.visible = False
p.add_tile(bkg_tile)
p.multi_line(xs='x', ys='y', color='white', alpha=0.8, line_width=4., source=lsource)
p_update = p.multi_line(xs='x', ys='y', color={'field': tag, 'transform': color_mapper}, alpha=0.5, line_width=2., source=lsource)
p.multi_line(xs='x', ys='y', color='white', alpha=0.8, line_width=8., source=lsource_end)
p.multi_line(xs='x', ys='y', color='red', alpha=0.5, line_width=6., source=lsource_end)
color_bar = ColorBar(orientation='horizontal', color_mapper=color_mapper, height=10, width=150, location=(80, 100))
p.add_layout(color_bar)
p.add_tools(map_hover)
show(p)